The ABFM begins to use differential item functioning.

Authors

  • Thomas R O'Neill
  • Michael R Peabody
  • James C Puffer
Abstract

The American Board of Family Medicine (ABFM) believes that it is important to have evidence to show that the pass/fail decisions related to its examinations are based on an accurate determination of the minimum knowledge necessary to be a board-certified family physician and, furthermore, that these decisions are unbiased against any particular subset of the population. Accordingly, as part of the ABFM’s commitment to continuously improve the Maintenance of Certification for Family Physicians (MC-FP) process, the ABFM has started using differential item functioning (DIF) procedures to detect potentially biased items on its examinations. Although data on examination applicants’ gender have been collected for some time, in the spring of 2013 we began collecting ethnicity data from applicants taking the MC-FP examination so that we could begin to conduct these analyses. DIF procedures are based on the idea that a test item is biased if individuals who have equal ability but are from different subpopulations do not have the same probability of answering it correctly. Although pass rates indicate whether a particular subpopulation is performing at a level comparable to other subpopulations, they are silent with regard to whether the meaning of the scores is stable across subpopulations. These differences could be due to bias in the items, which would effectively destabilize the construct. By this we mean that the items, when ordered by difficulty, form a linear construct ranging from less difficult to more difficult. If some items are more or less difficult relative to the other items for a specific subpopulation, then the construct represented by the test is degraded to the extent that the items are disordered for that subpopulation. On the other hand, the hierarchical construct represented by the test could be stable, and the difference in pass rates could be due to differences in socioeconomic status and the potential associated inequities inherent in the educational system. DIF analysis permits us to disentangle item-level bias from differences in ability among subpopulations. The process of calibrating test questions with regard to their difficulty for samples from both a subpopulation and the overall population is probabilistic. Therefore, this type of DIF study is best used as a screening tool to find potentially biased items; it does not prove that the items are biased. The ABFM DIF process can be viewed in 3 stages: (1) flagging potentially biased items, (2) examining the content of the flagged questions for sources of bias, and (3) determining their final disposition.
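The abstract describes the first stage of this process as flagging potentially biased items by comparing how an item functions for a subpopulation against the overall population, after matching on ability. The article does not say which statistic the ABFM uses, so the sketch below is only illustrative: it screens items with a Mantel-Haenszel odds ratio computed within total-score strata, a common DIF screening approach. The function name, the ETS-style delta threshold, and the simulated data are assumptions, not the ABFM's actual procedure.

    # Hypothetical sketch of stage 1 (flagging potentially biased items) using a
    # Mantel-Haenszel screen stratified on total score. Threshold, names, and
    # data below are illustrative assumptions.
    import numpy as np

    def mantel_haenszel_dif(responses, group, n_strata=5):
        """Flag items whose Mantel-Haenszel odds ratio departs from 1 after
        matching reference (group == 0) and focal (group == 1) examinees on
        total score, a stand-in for equal ability."""
        responses = np.asarray(responses, dtype=float)
        group = np.asarray(group)
        total = responses.sum(axis=1)
        edges = np.quantile(total, np.linspace(0, 1, n_strata + 1))
        strata = np.clip(np.searchsorted(edges, total, side="right") - 1, 0, n_strata - 1)

        results = []
        for j in range(responses.shape[1]):
            num = den = 0.0
            for s in range(n_strata):
                m = strata == s
                ref, foc = m & (group == 0), m & (group == 1)
                a = responses[ref, j].sum()        # reference group, correct
                b = ref.sum() - a                  # reference group, incorrect
                c = responses[foc, j].sum()        # focal group, correct
                d = foc.sum() - c                  # focal group, incorrect
                n = m.sum()
                if n > 0:
                    num += a * d / n
                    den += b * c / n
            or_mh = num / den if den > 0 else float("nan")
            # ETS delta metric; |delta| > 1.5 is a common "large DIF" screen
            delta = -2.35 * np.log(or_mh) if or_mh and not np.isnan(or_mh) else float("nan")
            results.append((j, or_mh, delta, abs(delta) > 1.5))
        return results

    # Simulated example: 500 examinees, 20 items, with item 3 made harder
    # for the focal group so the screen has something to flag.
    rng = np.random.default_rng(0)
    ability = rng.normal(size=500)
    difficulty = rng.normal(size=20)
    group = rng.integers(0, 2, size=500)
    p = 1.0 / (1.0 + np.exp(-(ability[:, None] - difficulty[None, :])))
    p[group == 1, 3] -= 0.2
    responses = (rng.random((500, 20)) < np.clip(p, 0.01, 0.99)).astype(int)

    for item, or_mh, delta, flagged in mantel_haenszel_dif(responses, group):
        if flagged:
            print(f"item {item}: MH odds ratio={or_mh:.2f}, delta={delta:.2f} -> send to content review")

Items flagged by a screen like this would then move to stages 2 and 3: a content review of the flagged questions and a decision about their final disposition.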

Similar articles

Using Multiple-Variable Matching to Identify EFL Ecological Sources of Differential Item Functioning

Context is a vague notion with numerous building blocks, which makes inferences from language test scores quite convoluted. This study made use of a model of item responding that strives to theorize the contextual infrastructure of differential item functioning (DIF) research and to help specify the sources of DIF. Two steps were taken in this research: first, to identify DIF by gender grouping via l...

Full text

Selecting the Best Fit Model in Cognitive Diagnostic Assessment: Differential Item Functioning Detection in the Reading Comprehension of the PhD Nationwide Admission Test

This study was an attempt to provide detailed information about the strengths and weaknesses of test takers' real ability through cognitive diagnostic assessment (CDA), and to detect differential item functioning in each test item. The rationale for using CDA was that it estimates an item's discrimination power, whereas classical test theory or item response theory depicts between- rather than within-item mu...

Full text

Differential Item Functioning (DIF) in Terms of Gender in the Reading Comprehension Subtest of a High-Stakes Test

Validation is an important enterprise, especially when a test is a high-stakes one. Demographic variables like gender and field of study can affect test results and interpretations. Differential Item Functioning (DIF) is a way to make sure that a test does not favor one group of test takers over the others. This study investigated DIF in terms of gender in the reading comprehension subtest (35 i...

Full text

A confirmatory study of Differential Item Functioning on EFL reading comprehension

The present study aimed at investigating DIF sources on an EFL reading comprehension test. Accordingly, 2 DIF detection methods, logistic regression (LR) and item response theory (IRT), were used to flag emergent DIF in the performance of 203 (110 female and 93 male) Iranian EFL examinees on a reading comprehension test. Seven hypothetical DIF sources were examin... (A sketch of the logistic-regression screen appears after this entry.)

Full text
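The study above pairs logistic regression (LR) with item response theory to flag DIF. A minimal sketch of the LR screen, in the spirit of the widely used Zumbo-style procedure, is given below; the choice of statsmodels and scipy, the variable names, and the significance level are my assumptions for illustration, not details taken from that study.

    # Hypothetical logistic-regression DIF screen: compare a model predicting an
    # item response from the matching variable (total score) alone with one that
    # adds group membership and a group-by-score interaction. A significant gain
    # suggests uniform and/or non-uniform DIF.
    import numpy as np
    import statsmodels.api as sm
    from scipy.stats import chi2

    def lr_dif(item, total, group, alpha=0.01):
        """Return the likelihood-ratio statistic, p-value, and a DIF flag."""
        base = sm.Logit(item, sm.add_constant(total)).fit(disp=0)
        full_X = sm.add_constant(np.column_stack([total, group, total * group]))
        full = sm.Logit(item, full_X).fit(disp=0)
        lr_stat = 2.0 * (full.llf - base.llf)        # 2 df: group + interaction
        p_value = chi2.sf(lr_stat, df=2)
        return lr_stat, p_value, p_value < alpha

    # Toy data: 400 examinees, one item answered correctly less often by group 1
    rng = np.random.default_rng(1)
    group = rng.integers(0, 2, size=400)
    total = rng.normal(25, 5, size=400)              # matching variable (e.g., total score)
    logit = 0.2 * (total - 25) - 0.8 * group
    item = (rng.random(400) < 1 / (1 + np.exp(-logit))).astype(int)

    stat, p, flagged = lr_dif(item, total, group)
    print(f"LR chi-square={stat:.2f}, p={p:.4f}, flag DIF: {flagged}")

Testing the group term and the interaction term jointly, as here, captures uniform and non-uniform DIF together; testing them separately is a common refinement.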

Item Parameter Drift: Concepts, Methodology, and Identification

Item parameter drift occurs when test items lose their initial characteristics, such as their difficulty and discrimination parameters, over time and for various reasons. Causes of item parameter drift include excessive item exposure, changes in the education system, item position effects, and poor initial parameter estimation. Item parameter drift causes invariance to be violat...

Full text



Journal:
  • Annals of Family Medicine

Volume 11, Issue 6

Pages: -

Publication date: 2013